The transcriptomic and epigenetic alterations in type 2 diabetes mellitus patients of Chinese Tibetan and Han populations

Background Due to the distinctive living environment, lifestyle, and diet, the Tibetan community in China has the lowest prevalence of T2DM and prediabetes among numerous ethnic groups, while Han community shows the highest statistic. In this study, we aim to conclude the clinical manifestations of both Tibetan and Han T2DM patients and their association with transcriptomic and epigenetic alterations. Methods A cross-sectional study including 120 T2DM patients from Han and Tibetan ethnic groups were conducted between 2019 to 2021 at the Hospital of Chengdu University of Traditional Chinese Medicine. The various clinical features and laboratory tests were recorded and analyzed between the two groups. The genome-wide methylation pattern and RNA expression were determined by Reduced Representation Bisulfite Sequencing (RBBS) and Poly (A) RNA sequencing (RNA-seq) from leucocytes of peripheral blood samples in 6 Han and 6 Tibetan patients. GO analysis and KEGG analysis were conducted in differentially expressed genes and those with differentially methylated regions. Results Compared to Han, Tibetan T2DM individuals intake more coarse grains, meat and yak butter, but less refined grains, vegetables and fruit. They also showed increased BMI, Hb, HbA1c, LDL, ALT, GGT and eGFR, and decreased level of BUN. Among the 12 patients in the exploratory cohort, we identified 5178 hypomethylated and 4787 hypermethylated regions involving 1613 genes in the Tibetan group. RNA-seq showed a total of 947 differentially expressed genes (DEGs) between the two groups, with 523 up-regulated and 424 down-regulated in Tibetan patients. By integrating DNA methylation and RNA expression data, we identified 112 DEGs with differentially methylated regions (overlapping genes) and 14 DEGs with promoter-related DMRs. The functional enrichment analysis demonstrated that the overlapping genes were primarily involved in metabolic pathways, PI3K-Akt signaling pathway, MAPK signaling pathway, pathways in cancer and Rap1 signaling pathway. Conclusion Our study demonstrates the clinical characteristics of T2DM differ subtly between various ethnic groups that may be related to epigenetic modifications, thus providing evidence and ideas for additional research on the genetic pattern of T2DM.


Introduction
The main features of type 2 diabetes mellitus (T2DM) include hyperinsulinemia, insulin resistance (IR) and islet cell damage, which can reach 50% at the time of diagnosis (1). With a high-energy diet, decreased physical activity, and an increase in obesity, the incidence of diabetes is rising globally, along with the rate of disability and mortality. People who have T2DM experience vascular and neurological consequences, as well as life, psychological, and financial stress. The diabetic population will predictably reach 147 million by 2045 (2). Most diabetes is a complex disease caused by a combination of multiple genes and environmental factors. Genetic factors are present in approximately 25% to 69% of people with T2DM worldwide (3) and over 560 genetic loci are identified to be relevant (4).
Epigenetics, including DNA methylation, histone modifications and microRNAs, lead to changes in gene function based on mitosis and meiosis without alteration in DNA sequence (5), in which DNA methylation has been recognized to be an important genetic factor contributing to T2DM (6). DNA methylation refers to the S-adenosyl methionine (SAM), as the methyl donor, transfers the activated methyl group to carbon 5 of the cytosine-phosphate-guanine (CpG) by the catalyzation of DNA methyltransferases (DNMTs). In general, gene expression is opposite to the level of methylation in the promoter region, which means low methylation levels result in up-regulation of gene expression, whereas high methylation results in down-regulation of expression (7,8). As DNA methylation is reversible and can be interfered with, some chemicals can be used as targets to modify DNA methylation (9), providing a new perspective for T2DM treatment. Previous studies have shown that many genes are related to islet function, such as PDX1 (10), PPARGC1A (11), INS (12), GLP1R (13) and KCNQ1 (14), have been associated with the development of T2DM. Meanwhile, methylome-wide association studies (MWAS) for T2DM have identified differentially methylated sites (DMSs) in TXNIP (15), PHOSPHO1 (16), SREBF1 (17), ABCG1 (17), SOCS3 (18), and CPTA1 (19).
Environmental factors such as diet, exercise and obesity can also alter the epigenome. Tibetans are a distinct ethnic group in China that have historically lived at high altitudes. They primarily reside in the Tibetan Autonomous Region (TAR), as well as the provinces of Qinghai, Sichuan, Yunnan, and Gansu in China. Although highlanders had a lower incidence of diabetes, the number has quickly risen as a result of greater longevity and lifestyle changes (20,21). According to nationwide research, the Han Chinese population had a 14.7% prevalence of diabetes and a 38.8% prevalence of prediabetes, whereas the Tibetan community had the lowest prevalence of both conditions at 4.3% and 31.3%, respectively (22). Lifestyle changes, particularly in calorie intake, are associated with the development of diabetes, possibly through epigenetic mechanisms (23,24). This study aimed to demonstrate the differences in clinical characteristics between Tibetan and Han T2DM patients and to explore the transcriptomic and epigenetic alterations in the two groups. There was no kinship between the included study subjects and three consecutive generations for each patient are the same ethnic group. The exclusion criteria include 1) other types of diabetes; 2) having immune system diseases; 3) any types of tumors; 4) acute and chronic infections; 5) psychoneurological disorders; 6) recent use of drugs that affect lipid metabolism; 7) having liver or kidney failure or severe heart diseases; 8) disagreeing to participate in the study.

Subjects enrolled for exploratory cohort
Among the cross-sectional cohort, 6 Tibetan and 6 matched Han T2DM patients were selected for the exploratory cohort of RNA expression and DNA methylation. These two groups were selected by a matched pairs design based on shared characteristics including age, gender, weight, height and duration of T2DM to control lurking variables.

Clinical data collection
The general information for all patients included age, gender, body mass index (BMI), drinking and smoking history, family history of diabetes, the duration of T2DM, food intake, systolic blood pressure (SBP), diastolic blood pressure (DBP) and hemoglobin (Hb). In addition to HbA1c test, a standard 2-h OGTT test was performed by using a 75g glucose load to assess the patient's islet function and glycemic control. Plasma glucose and insulin level at 0 (fasting), 1, 2 and 3-hour postprandial blood glucose (PBG) were measured, and blood C-peptide was measured at 0 and 2-hour postprandial. The biochemical analysis includes the total cholesterol (TC), triglyceride (TG), high-density lipoprotein cholesterol (HDL-C), low-density lipoprotein cholesterol (LDL-C), alanine aminotransferase (ALT), aspartate aminotransferase (AST), alkaline phosphatase (ALP), g-glutamyltransferase (GGT), total bile acid (TBA), direct bilirubin (DBIL), blood creatinine (BCr), blood urea nitrogen (BUN), blood uric acid (BUA), were detected to estimate blood lipids, liver function, and kidney function. The Cockcroft-Gault equation was used to determine the estimated glomerular filtration rate (eGFR) (26).

Blood sample and DNA extraction
In the exploratory cohort, 3ml EDTA-treated peripheral blood sample of each participant was collected and stored in -80°C. Genomic DNA was extracted from peripheral blood using magnetic universal genomic DNA kit (TIANGEN Biotech (Beijing) co., Ltd). DNA concentration and quality were measured by Nanodrop.

Reduced representation bisulfite sequencing
1μg genomic DNA was digested using MspI enzyme for 16 hours at 37°C. After digestion, libraries were constructed as the Illumina Pair-End protocol with some modifications. Briefly, purified digested DNA was subsequently treated with a mix of T4 DNA polymerase, Klenow Fragment and T4 polynucleotide kinase to repair, blunt and phosphorylate ends. The DNA fragments were subsequently 3' adenylated using Klenow Fragment (3'-5' exo-) and following with ligation to adaptors synthesized with 5'-methylcytosine instead of cytosine using T4 DNA Ligase. the DNA was purified using QIAquick PCR purification kit (Qiagen) after reaction of each step. After purification, the library was subjected to 40°C for 30 min treatment in a thermo cycler with the lid heated at 57°C. After that, centrifuged the reaction mixture at 14,000 X g for 10 min and then transferred the supernatant into a new 0.2 ml PCR tube for the further bisulfite treatment, respectively. Bisulfite conversion treatment was performed using a ZYMO EZ DNA Methylation-Gold Kit (Zymo research, Irvine, CA, USA) according to the manufacturer's instructions. The final RRBS libraries were generated by PCR amplification using adapter compatible barcode primers, quantified by an Agilent 2100 Bioanalyzer (Agilent Technologies) and real-time PCR assay and then sequenced by Illumina Hiseq.

Methylation calculation and identification of DMRs
Low-quality reads that contained more than 5 'N's or had a lowquality value for over 50% of the sequence (Phred score< 5) were filtered. The sequencing reads of the samples were aligned to the human reference genome (hg19) using BSMAP (Version 2.74) (27). The methylated CpG (mCG) sites were identified following a previously described algorithm (28). The methylation levels for each sample were calculated using in-house Perl scripts.

RNA library construction and sequencing
Total RNA was extracted from cells using Trizol (Invitrogen) according to the manufacturer's protocol, and ribosomal RNA was removed using the Ribo-Zero ™ kit (Epicentre, Madison, WI, USA). Fragmented RNA (the average length was approximately 200 bp) was subjected to first-strand and second-strand cDNA synthesis followed by adaptor ligation and enrichment with a low cycle according to instructions of NEBNext ® Ultra ™ RNA Library Prep Kit for Illumina (NEB, USA). The purified library products were evaluated using the Agilent 2200 TapeStation and Qubit ® 2.0 (Life Technologies, USA). The libraries were paired-end sequenced (PE150, Sequencing reads were 150 bp) at Guangzhou MethylGene Co., Ltd. (Guangzhou, China) using the Illumina Xten platform.

Pre-processing of sequencing reads/ quality control
Raw fastq sequences were treated with Trimmomatic tools (v 0.36) using the following options: TRAILING: 20, MINLEN:235 and CROP:235, to remove trailing sequences below a Phred quality score of 20 and to achieve uniform sequence lengths for downstream clustering processes. Sequencing read quality was inspected using the FastQC software. Adapter removal and read trimming were performed using Trimmomatic. Sequencing reads were trimmed from the end (base quality less than Q20) and filtered by length (less than 25).

Quantification of gene expression level
Paired-end reads were aligned to the human reference genome (hg19) with HISAT2. HTSeq v0. 6.0 was used to count the numbers of reads mapped to each gene. The whole sample expression levels were presented as RPKM (expected number of Reads Per Kilobase of transcript sequence per Million base pairs sequenced), which is the recommended and most common method to estimate the level of gene expression.

Differential expression analysis
The statistically significant DE genes were obtained by an adjusted P-value threshold of<0.05 and |log2(fold change) | > 1 using the DEGseq software. Finally, a hierarchical clustering analysis was performed using the R language package gplots according to the RPKM values of differential genes in different groups. And colors represent different clustering information, such as the similar expression pattern in the same group, including similar functions or participating in the same biological process.

GO terms and KEGG pathway enrichment analysis
All differentially expressed mRNAs were selected for GO and KEGG pathway analyses. GO was performed with KOBAS3.0 software, including cellular component (CC), molecular function (MF) and biological process (BP). GO provides label classification of gene function and gene product attributes (http://www. geneontology.org). GO analysis covers three domains: cellular component (CC), molecular function (MF) and biological process (BP). The differentially expressed mRNAs and the enrichment of different pathways were mapped using the KEGG pathways with KOBAS3.0 software (http://www.genome.jp/kegg).

Statistical analysis
The median and quartile were used for the statistical and data description of the normally distributed measures, and the number of cases (n) and percentages (%) were used for the statistical and data description of the categorical counts. Normality and homogeneity of all data were evaluated by Kolmogorov-Smirnov test. Student T-test or Mann-Whitney U test was applied to compare the differences of continuous variables. Pearson Chisquare test was employed to evaluate statistical differences of categorical variables. The Wilcoxon test was used to compare the continuous non-normally distributed variables between 6 Tibetans and 6 Hans in the exploratory cohort. Pearson correlation was used to identify the 14 overlapping genes and clinical characteristics with significant differences. All data were statistically analyzed by SPSS 23.0 software (SPSS Inc., Chicago, IL, USA). Graphs were generated using Graphpad 7.0 software (GraphPad Software, Inc., San Diego, USA).

The demographical and clinical characteristics between Tibetan and Han T2DM populations
A total of 120 participants were enrolled for the final analysis, including 60 Tibetans and 60 Hans. The patient flow chart is demonstrated in Figure 1. The basic and biochemical characteristics are shown in Table 1. Although no difference was observed in age, gender, and duration of T2DM, the BMI of Tibetans was significantly higher than Hans (26.08 vs 23.3, P = 0.017). Tibetans consume fewer refined grains (141.5 g/day vs 193.5 g/day, P< 0.001), vegetables and fruit (91 g/day vs 296.5 g/day, P< 0.001) than Han people, but they consume more coarse grains (171 g/day vs 63.5 g/day, P< 0.001), meat (181.5 g/day vs 100.5 g/day, P< 0.001), and yak butter (98.5 g/day vs 0 g/day, P< 0.001). Not surprisingly, the Hb level of Tibetans is higher than Hans (146.5 g/L vs 138.5 g/L, P< 0.001) due to the high-altitude, low-oxygen environment of Tibetan settlements. Despite similar BG, insulin, and C-peptide level, HbA1c of Tibetan T2DM patients was higher than Han patients (9.75% vs 8.65%, P = 0.001). Similarly, LDL level is significantly higher in Tibetan group compared to Han group (3.12 mmol/L vs 2.53 mmol/L, P = 0.002). Regarding the liver function, the blood tests also showed higher levels of ALT (30.5 IU/ L vs 21.5 IU/L, P = 0.013), and GGT (38 IU/L vs 21 IU/L, P< 0.001). The level of BUN was lower (4.95 mmol/L vs 5.61 mmol/L, P = 0.002) and eGFR of Tibetan T2DM patients was statistically higher than Han patients (129.77 mL/min vs 96.5 mL/min, P< 0.001). There were no significant differences in other parameters of biochemical tests between the two groups.
A total of 12 patients with 6 in each group were selected by paired design for the exploratory cohort. The age ranged from 33 to 54 years Flow chart of patient collection. COPD: chronic obstructive pulmonary disease. old and the duration of T2DM ranged from 2 to 7.1 years. As shown in Supplementary Table S1, there were no significant differences in basic and biochemical parameters except HbA1c (

Differentially methylated positions and regions
The whole-genome DNA methylation was detected by RRBS using peripheral blood samples from 6 Tibetan and 6 Han T2DM patients. After sulphite treatment, the conversion efficiency of all samples ranged from 98.82% to 99.27%. About 80% to 90% mCs were CG dinucleotides while about 10% to 20% were at CHG and CHH sites (G = A, C or T) (Supplementary Figure S1). Additionally, the methylation level of mC was around 80% to 100% while mCHG and mCHH were around 0% to 20%, with 20% as an interval (Supplementary Figure S2). We also explored the methylation levels in different genome regions. The level of methylation decreased in the 2kb upstream of transcription initiation but rose sharply in the exon region and reaches a maximum in the intron and 2kb downstream of genes (Supplementary Figure S3). The DMRs were mainly located in the intergenic region, accounting for 38.83%, followed by intron (32.15%) and exon regions (10.38%), respectively, in addition to 6.7% of DMRs within the gene promoter region (upstream 2kb) (Figure 2A). PCA found distinct clusters for study subjects ( Figure 2B). The heatmap ( Figure 2C) and volcano map ( Figure 2D) have demonstrated the methylation difference between the two groups. Compared with Han group, we identified 5178 hypomethylated regions and 4787 hypermethylated regions in Tibetans (Table 2).
We performed GO functional analysis according to DMR-related genes, which were mostly enriched in protein binding (BP), nucleus (CC), cytoplasm (CC) and membrane (CC) ( Figure 3A). KEGG analysis showed that DMR-related genes are mainly involved in metabolic pathway, pathways in cancer, cAMP signaling pathway, HTLV-I infection, cytokine-cytokine receptor interaction, calcium signaling pathway, alcoholism, regulation of actin cytoskeleton, hippo signaling pathway, Wnt signaling pathway, non-alcoholic fatty liver disease (NAFLD), insulin secretion, glycerophospholipid metabolism and type 2 diabetes mellitus ( Figure 3B).

Transcriptome analysis
We conducted RNA-seq on peripheral blood samples from Han and Tibetan T2DM patients in order to investigate the relationship between DNA methylation and gene expression. Each sample produced about 8 giga bases (Gb) of filtered data. Additionally, using HISAT2 software, sequencing data were compared to the human reference genome with an average match rate of 90.2% per sample and an average unique mapping rate of 86.95% (Supplementary Table S2). Gene expression levels are calculated by RPKM as the number of reads per kilobase length from a given gene per million reads, and are calculated as RPKM =

Total exon reads
Mapped readsÂexon length (31). A volcano map of significantly differentially expressed genes (DEGs) was created by differential gene expression analysis using the DESeq program, with 523 genes significantly up-regulated and 424 genes significantly down-regulated in the Tibetan group compared to the Han group ( Figure 4A). The heat map revealed distinct gene expression patterns in the Tibetan and Han populations ( Figure 4B).
Functional annotation showed that the most represented GO categories for DEGs were extracellular (CC), receptor-mediated endocytosis (BP), xenobiotic metabolic process (BP), negative regulation of endopeptidase activity (BP) and cellular response to hormone stimulus (BP) (Figure 5A), while KEGG enrichment analysis showed that the upregulated DEGs were mainly involved in steroid hormone biosynthesis, retinol metabolism, drug metabolism-cytochrome P450, PI3K-AkT signaling pathway, pentose and glucuronate interconversions, starch and sucrose metabolism, ascorbate and aldarate metabolism ( Figure 5B).

Integrative analysis of transcriptome and DNA methylation
In general, gene expression is negatively correlated with DNA methylation. We divided each sample into four categories, including silence, low expression, medium expression, and high expression, according to the amount of gene expression and counted the methylation levels in the gene regions of each of the four categories of genes in a single sample. Our results showed DNA methylation was negatively correlated with gene expression in regions within 1k upstream of the gene, and genes with high methylation status were not expressed or were underexpressed ( Figure 6). We observed 112 overlapping DEGs and DMR genes, of which 14 were promoter-related genes ( Table 3). The GO enrichment analysis showed that the most significant enriched GO terms of overlapped genes are integral component of membrane (CC), plasma membrane (CC), homophilic cell adhesion via plasma membrane adhesion molecules (BP), calcium ion binding (BP) ( Figure 7A).
According to KEGG enrichment analysis, these overlapping genes were primarily involved in metabolic pathway, including metabolism of xenobiotics by cytochrome P450, steroid hormone biosynthesis, retinol metabolism, ascorbate and aldarate metabolism, pentose and glucuronate interconversions, porphyrin and chlorophyll metabolism, drug metabolism, starch and sucrose metabolism, chemical carcinogenesis, PI3K-Akt signaling pathway, MAPK signaling pathway, pathways in cancer and Rap1 signaling pathway ( Figure 7B). The relationship between overlapping genes and significant clinical characteristics was analyzed by Pearson correlation analysis. We found that the HbA1c was associated with the expression of RHOD (R = 0.697, P< 0.

Discussion
Our study reported the clinical characteristics of Han and Tibetan T2DM patients, indicating that the same disease has clinical differences between various ethnic groups and providing evidence for clinical individualization of T2DM treatment. We also revealed for the first time the differences in DNA methylation and RNA expression between Tibetan and Han T2DM patients, and synthesized the relationship between them, which provides a basis for further exploration of T2DM development mechanisms and identification of therapeutic targets.
Tibetans live in a high altitude, low oxygen, low temperature environment. Previous studies have shown that in a healthy population of Han Chinese and Tibetans living at the same altitude, the hemoglobin concentration of highland Han is higher than that of Tibetans (32). However, Han Chinese living at lower altitudes have lower hemoglobin concentrations (33), which is consistent with our results. The diet of Tibetans consists mainly of coarse grains, meat, yak butter and other high-fat, highcalorie, high-protein foods, thus have a higher BMI. However, no differences were shown in other lipid indicators in our crosssectional cohort, except for higher LDL level in the Tibetan T2DM group, which indicates that Tibetans may have higher metabolism.
The Han and Tibetan populations also showed slight differences in liver function, with Tibetans having higher ALT and GGT levels,  but both in the normal range. Aminotransferases are considered indicators of hepatocyte health, and GGT also reflects biliary tract function. Elevated ALT is associated with age, obesity, elevated triglyceride levels, and low HDL cholesterol levels, but not with glycemic control (34). However, independent of common risk factors, ALT (35,36) and GGT (37) are linked to an increase in the risk of T2DM. Although the eGFR level of Tibetan T2DM patients was higher and the BUN was lower than that of Han Chinese, both were at normal levels.
In the exploratory cohort, we further investigated the differences in DNA methylation and transcriptome between Han and Tibetan populations to interpret the differences in the development of T2DM between the two groups through a genetic perspective. The CpG island is a region of the DNA sequence rich in CpG sites, usually A B FIGURE 4 The located in the promoters with an unmethylated state. When CpG islands are methylated, transcription factors become impaired in binding to promoters or bind to transcriptional repressors, altering the structure of chromatin. As a result, gene expression is altered without the changes of DNA sequence, affecting biological processes and leading to diseases (38, 39). In our study, a total of 1613 genes with DMRs were found between Han Chinese and Tibetan T2DM patients. After GO and KEGG functional enrichment analysis, we identified signaling pathways that affect metabolism and other pathways that may play a key role in the development of T2DM, such as insulin secretion. Among them, cAMP signaling pathway, Wnt signaling pathway, and Hippo signaling pathway were more significant and relevant. cAMP is an intracellular mediator of insulin and adrenal glycogen catabolism in the liver (40). In mammals, cAMP activates cAMP-dependent protein kinase (PKA), which phosphorylates downstream protein targets and then regulates the function of ion channels, transcription factors and enzymes. Meanwhile, the cAMP signaling pathway regulates glucose homeostasis due to insulin secretion, glucose utilization, and glycogen synthesis and catabolism (41). The Wnt signaling plays an important role as an evolutionary pathway in regulating cellular homeostasis and energy homeostasis from the hypothalamus to the metabolic organs. The classical Wnt as well as non-classical Wnt pathways inhibit metabolism and lead to increased adipose tissue, resulting in metabolic stress and metabolic inflammation and obesity (42). The Hippo signaling pathway plays a role in pancreatic, hepatic, adipose and cardiac cells as well as in systemic metabolism, regulating glucolipid metabolism. Activation of the Hippo signaling pathway in hyperglycemic states induces proliferation and differentiation of pancreatic b-cells, increasing glucose uptake and utilization, thereby reducing insulin resistance, and improving insulin secretion (43).
In general, gene expression follows an opposite trend to the level of methylation in the promoter region. In the present study, we identified 947 differentially expressed genes, of which 112 overlapping genes had differential methylation levels, and a total of 14 genes with differentially methylated regions in the promoter region. Among the differentially expressed genes found to be differentially methylated in promoter regions, APOB encodes apolipoprotein B and is associated with LDL, celiac and LDL structural integrity, in lipid digestion, mobilization as well as transport (44). A study on the amount of noninsulin-dependent diabetic patients showed that APOB polymorphisms were effective in improving blood glucose and lipid levels of T2DM patients (45). PAX8-AS1 is a non-coding RNA that is involved in the pathology of the disease despite its inability to encode protein synthesis. In a study examining non-coding RNA in leukocytes from patients with gestational diabetes mellitus (GDM), PAX8-AS1 expression levels were significantly lower in GDM patients compared to healthy pregnant women and could be used as a diagnostic biomarker for GDM (46). The rest of key genes need to be further studied in the future.
to increase glucose utilization, reduce glucose metabolism in the liver and muscle, and regulate the balance of lipid and glucose metabolism. However, impairment of this pathway leads to insulin resistance, which in turn worsens this pathway, leading to T2DM (52). In addition, insulin can also activate MAPK pathways but inappropriate MAPK signaling contributes to the development of metabolic syndrome and T2DM (53). An in vitro study showed that activated Rap1 is a key regulator of b-cell function, as evidenced by the promotion of glucose-stimulated insulin production, islet cell hypertrophy, and islet cell proliferation by activated Rap1A (54).
The overlapping genes primarily are associated with metabolism and insulin-related pathways, suggesting that the environment and lifestyle, such as diet, may play a role in altering DNA methylations levels, therefore affecting metabolism and insulin secretion and utilization in T2DM patients.

Conclusion
As the prevalence of T2DM varies in different ethnic groups in China, our study revealed the diverse clinical features of Tibetan and Han T2DM patients. The epigenetic and transcriptional patterns have provided a perspective on the mechanisms of T2DM in different ethnic groups, and the key genes are worthy to be further studied to reveal the importance of DNA methylation for the development of T2DM.

Data availability statement
The datasets presented in this study can be found in online repositories. The name of the repository/repositories and accession number(s) can be found below: https://www.ncbi.nlm.nih.gov/, accession number: PRJNA911064.

Ethics statement
The studies involving human participants were reviewed and approved by Hospital of Chengdu University of Traditional Chinese Medicine. The patients/participants provided their written informed consent to participate in this study.

Author contributions
JL and XW conceived of the presented idea, conducted the study of cross-sectional cohort. XW carried out the exploratory study of epigenome and transcriptome. QC and QW supervised the project. All the authors contributed to the final version of the manuscript.

Funding
The study was supported by Sichuan Provincial Science and Technology Plan Project (19YYJC1515) and Special Training Program for Outstanding Young Scientific and Technological Talents (Innovation) of Chinese Medical Sciences (Grant No. ZZ14-YQ-010).

Conflict of interest
The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest. Publisher's note All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.